I. Introduction


In this report we intend to examine the problem of
representing mathematical processes. We shall consider
this problem in the context of digital computer software
and hardware, both because the availability of such
computational machinery makes this the most useful avenue
of approach and because this computational machinery has
played such an important role in shaping the way in
which people think about mathematical processes. In this
context representations of mathematical processes are
normally called algorithms. The word 'algorithm', however,
tends to have a much narrower meaning, and the
restrictions implied by the use of this word are built
into the languages in which algorithms are commonly
formulated. We shall begin by examining briefly the
function of these standard representational forms. We
shall try to determine exactly what representational
restrictions they impose, and where these seem
unjustifiable, we will propose alternative representational
forms.


II.  Conventional  Algorithmic  Representations 

Let us consider a typical computing situation. A human
being has some (perhaps relatively imprecise) notion of
a mapping from some domain of inputs to some range of
outputs; this mapping presumably takes form in his mind
as a sequence of transformations on the inputs. He
formulates this mapping precisely as an algorithm in
some computer-oriented language like FORTRAN. A compiler
then translates this definition of the mapping into
a program which drives some computer in such a way that
it performs the desired mapping. This procedure involves
a series of translations: from human notion to
algorithmic language to hardware states. For these
translations to be feasible there must be a reasonable
similarity between the way in which human beings structure
mappings, the structure of the algorithmic language, and
the structure of the computing machinery.

The problem is that in designing languages to express
algorithms (and computers to perform them), we have two
often conflicting aims. The first of these aims
is to provide human beings with the most convenient
representational medium possible for the definition of
mappings. The second is to provide a representational
form which can be conveniently translated into the most
efficient hardware implementation possible with respect
to space and time (i.e., how much equipment is required
for how long).

With respect to the first aim, a number of criticisms
can be made of algorithmic languages. For purposes of
this discussion, however, we shall assume at the outset
that, at least for a large and interesting class of
problems, these languages (particularly with respect to
their fundamental conceptual organization) provide the
most convenient possible representational medium for the
definition of input/output mappings by human beings. We
will concern ourselves instead with the second function
of algorithmic languages: that of providing a satisfactory
source representation for the translation into
the most efficient possible hardware implementation. We
shall argue that from this point of view the fundamental
conceptual view of mathematical processes which underlies
standard algorithmic languages (and machine design) is
unsatisfactory. We shall propose a representational form
with a different conceptual groundwork and demonstrate the
feasibility of translation from standard algorithmic
languages into this representational form. We shall try
to indicate both how this representational form might
enable us to exploit current computing machinery more
efficiently and, more importantly, what implications it
might have for the design and exploitation of more
powerful machinery.

We shall begin by examining in some detail the view of
mathematical processes which provides the foundation for
algorithmic languages and machine design. Let us
consider, for example, a flowblock diagram of an
algorithm defined in some language like FORTRAN. The
diagram is a directed graph whose nodes are the
flowblocks; each flowblock contains a totally ordered set of
FORTRAN statements. The flowblocks are connected by
directed arcs; each arc is an output of exactly one
flowblock, and an input to exactly one flowblock. Cycles and
loops are permitted. Each flowblock has at least one
input arc and at least one output arc with the exception
of a unique flowblock called entry, which has no input
arc, and a unique flowblock called exit, which has no
output arc. Since this diagram is to be a representation
of a process, it is meaningless without some sort of
simulation rule. This is provided by creating an entity
called control. Control can be thought of as a unique
token which moves through the diagram in discrete steps,
residing at any given time at exactly one statement. To
begin simulation of the algorithm, control is placed on
the first statement of the entry flowblock. Within a
flowblock, control moves from one statement to its
immediate successor (the statements within a flowblock
are always totally ordered); from the last statement of
a flowblock, control may move along any one of the output
arcs to the first statement of some other (or the same)
flowblock. Each time control resides at a statement,
that statement is executed exactly once. When control
arrives at the last statement of the exit flowblock,
the simulation is completed. We now have a rough picture
of an algorithm functioning: a unique entity named control
wanders through a "flow diagram" bringing to life one
statement at a time as it drifts by. The two most
interesting features of this picture are (1) that at
any given time during a simulation, control resides at
exactly one statement and (2) that in the course of one
simulation, control may visit the same statement many
times.

We  must  now  examine  the  individual  statements.  In  order 
to  avoid  unnecessary  complications,  let  us  invent  a 
simplified  version  of  FORTRAN  which  permits: 

(1) two types of I/O statements: the word 'READ' followed
by exactly one variable-name, and the word 'WRITE' followed
by exactly one variable-name;

(2) assignment statements, consisting of exactly one
variable-name followed by '=' followed by either one
variable-name (or one integer) or else by two
variable-names (or two integers, or one variable-name and one
integer) separated by an arithmetic or Boolean operator;

(3) control statements of two types: 'GO TO' followed by
a statement-name, and 'IF' followed by a variable-name
followed by three statement-names.

Let us look first at a typical assignment statement:
A=B+C. The variable-names (A, B, and C in this
example) act as "placeholders" for values. We could
translate this statement as follows: add the value
currently assigned to B and the value currently assigned
to C; assign the result to A. Hence, we call A the
result and B and C the operands. Once control
encounters this statement, A will continue to "stand for"
the value assigned to it by the execution of the
statement until control encounters another assignment to A
(or re-encounters the same assignment to A). In other
words, any variable-name occurring on the right side of
an assignment statement (i.e., as an operand) represents
the result of the most recently executed assignment to
that variable-name. Because the same variable-name may
be designated as the result in several different statements,
and because control may pass to the same statement more
than once, a given variable-name may represent many
different values during one performance of the algorithm.
However, the fact that control can reside at only one
statement at a time guarantees that at any given time
during the performance of an algorithm, a given
variable-name represents (at most) one value (since there can be at
most one most-recently-encountered assignment to that
variable-name). Note that the concepts control and
variable are interdependent.

Initially, at least, we can think of I/O statements as
"incomplete" assignment statements. A READ statement
assigns a value to a variable-name; a WRITE statement
uses a variable-name as an operand.

Control statements do not directly affect the values of
variables. Instead they determine the path of control
from one flowblock to another. In particular an IF
statement uses the current value of some variable (which we may
think of as the operand of the IF statement) as the
criterion for determining which of several alternative
"paths" (i.e., output arcs from the flowblock) control
will take. Consequently, such statements are commonly
called "decisions".

We now have a conceptually complete model of an algorithm.
If we eliminate the two-dimensional aspects of this picture
by arranging the statements in a list, we have a typical
algorithm definition in an algorithmic language. One
statement in the list can be designated the initial
statement and another the terminal statement. To "run" the
algorithm we place our "control token" on the initial
statement. Control then moves down the list executing
one statement at a time; some of the statements may be
control statements whose execution may cause control to
be sent to some statement other than the immediately
subsequent statement; when control arrives at the
terminal statement, the "run" is completed. Hardware
performance of an algorithm is pictured with the same
conceptual machinery. A program consists of a set of
instructions, each of which occupies one location in
memory. Memory is totally ordered: each location has
a unique successor. There is a control counter which
(roughly speaking) always contains the memory address of
the next instruction to be executed and there is some
sort of central processor which executes the instructions
one at a time. Each time an instruction is executed, the
address in the control counter is incremented so that it
contains the address of the next location in memory. The
execution of a transfer instruction, however, may place
the address of some other memory location in the control
counter. An instruction which specifies the execution of
some arithmetic or logical operation may designate a
memory location whose contents are to be used as an operand
or an instruction may designate a memory location in which
the result of some such operation is to be stored. Thus
memory locations may be used very naturally in a way
analogous to variable-names in an algorithmic language
representation. In general, the translation from
algorithmic language to hardware will be relatively
straightforward.
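The control-counter model described above can be sketched as a small interpreter. This is only an illustration under stated assumptions: the tuple encoding of statements, the names `run`, `OPS`, and `SUM_LOOP`, and the reading of the three-way IF as "negative, zero, positive" are hypothetical choices, not part of the report. The terminal statement is assumed to be the last statement in the list.

```python
import operator

# Arithmetic operators allowed in this sketch (Boolean operators are omitted).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def run(program, inputs):
    """Execute a list of (label, statement) pairs under a single control counter."""
    labels = {label: i for i, (label, _) in enumerate(program) if label}
    values, outputs, pending = {}, [], list(inputs)

    def value(x):
        # An operand is either an integer literal or a variable-name.
        return x if isinstance(x, int) else values[x]

    pc = 0  # the control counter: index of the next statement to execute
    while pc < len(program):
        _, stmt = program[pc]
        pc += 1  # by default control moves to the immediate successor
        kind = stmt[0]
        if kind == "READ":
            values[stmt[1]] = pending.pop(0)
        elif kind == "WRITE":
            outputs.append(values[stmt[1]])
        elif kind == "ASSIGN":
            if len(stmt) == 3:      # ("ASSIGN", result, operand)
                values[stmt[1]] = value(stmt[2])
            else:                   # ("ASSIGN", result, left, op, right)
                values[stmt[1]] = OPS[stmt[3]](value(stmt[2]), value(stmt[4]))
        elif kind == "GOTO":
            pc = labels[stmt[1]]
        elif kind == "IF":          # ("IF", variable, neg, zero, pos), an assumed reading
            v = values[stmt[1]]
            pc = labels[stmt[2] if v < 0 else stmt[3] if v == 0 else stmt[4]]
    return outputs

# Hypothetical example: sum the integers N, N-1, ..., 1 with one loop.
SUM_LOOP = [
    (None, ("READ", "N")),
    (None, ("ASSIGN", "S", 0)),
    ("LOOP", ("ASSIGN", "S", "S", "+", "N")),
    (None, ("ASSIGN", "N", "N", "-", 1)),
    (None, ("IF", "N", "DONE", "DONE", "LOOP")),
    ("DONE", ("WRITE", "S")),
]
```

Calling `run(SUM_LOOP, [3])` visits the statement labelled LOOP three times and returns `[6]`. The variable-name S stands for a different value on each visit, yet at any moment it holds exactly one value because there is exactly one control counter.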


This conceptual machinery for representing mathematical
processes is in some respects extremely powerful. The
fact that during one performance of an algorithm the same
statement may be executed many times and that the same
instance of a variable-name may represent a different
value each time, means that a relatively small set of
statements may represent an arbitrarily long sequence of
different computations. In hardware terms this means
that a relatively small computing device can be programmed
to perform a relatively long and varied sequence of
operations. Let us illustrate more clearly, with the help of
a simple example, exactly how this conciseness is achieved.
Consider the following algorithm consisting of five
statements:

    READ A
    READ B
    A=A-B
    A=A*B
    WRITE A

Suppose that the input domain is defined as follows:
the possible input values of A are the integers 1, 2,
and 3; the possible input values of B are the integers
1 and 2. The algorithm could then be thought of as
representing the following mapping:

    (A, B)    OUTPUT
    (1, 1)      0
    (1, 2)     -2
    (2, 1)      1
    (2, 2)      0
    (3, 1)      2
    (3, 2)      2

However, the algorithm defines the mapping as a sequence
of arithmetic transformations on an ordered input pair so
that from this point of view we may think of the algorithm
as representing a set of computational histories, one
for each input pair in the domain, as follows:


    INPUT 1,1   INPUT 1,2   INPUT 2,1   INPUT 2,2   INPUT 3,1   INPUT 3,2
    0=1-1       -1=1-2      1=2-1       0=2-2       2=3-1       1=3-2
    0=0*1       -2=-1*2     1=1*1       0=0*2       2=2*1       2=1*2
    OUTPUT 0    OUTPUT -2   OUTPUT 1    OUTPUT 0    OUTPUT 2    OUTPUT 2
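The six computational histories can be regenerated mechanically from the five-statement algorithm. The following is a minimal sketch; the names `history`, `histories`, and `mapping` are illustrative and do not appear in the report.

```python
from itertools import product

def history(a, b):
    """Return (steps, output) for the algorithm READ A, READ B, A=A-B, A=A*B, WRITE A."""
    steps = []
    difference = a - b                      # A=A-B
    steps.append(f"{difference}={a}-{b}")
    result = difference * b                 # A=A*B
    steps.append(f"{result}={difference}*{b}")
    return steps, result

# One history per input pair in the domain A in {1, 2, 3}, B in {1, 2}.
histories = {(a, b): history(a, b) for a, b in product((1, 2, 3), (1, 2))}

# The input/output mapping is what remains when the intermediate steps are dropped.
mapping = {pair: output for pair, (_, output) in histories.items()}
```

The single function body plays the role of the algorithmic definition: it is one text that stands for all six histories, with `a`, `b`, and the intermediate names taking mutually exclusive values from one call to the next.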

The algorithmic definition provides one representation for
all of these computational histories by what we will call
part-part matching: "parts" of different computational
histories are "matched" to each other in such a way that
all the computational histories may be "overlaid" and
given one representation. Each variable-name then
represents a set of mutually exclusive values (for example,
in any given performance of the algorithm, B represents
either 1 or 2), and each statement therefore represents
some set of mutually exclusive computations (for example,
A-B represents either 1-1 or 1-2 or 2-1, etc.). Let us
next examine the role of control with the help of a slightly
more complicated example:

We will now take for granted the kind of part-part
matching discussed in the preceding paragraphs and
consider only the range of "control histories" which this
algorithm represents. By "control histories" we mean
the set of possible paths from entry to exit. (Each
"control history" may of course represent many
computational histories.) Given some finite domain of input
values, we could in principle represent each possible
control history as a directed graph as follows: (see
next page)


If we took such a set of all possible histories as our
starting point, we could view the task of producing
an algorithmic language definition of the algorithm as
a process of further part-part matching: we first
"fold up" each of the individual histories by matching
different parts of the same history to each other, as
follows:

In history j, for example, we have matched the first
occurrence of operation 5 with the second occurrence of
operation 5; since the first occurrence had an occurrence
of operation 4 as its successor and the second occurrence
had an occurrence of operation 6 as its successor, in the
folded-up representation of the history operation 5 has
both operation 4 and operation 6 as successors. We can now
"overlay" all the folded-up histories by similarly matching
parts of different histories to each other. We match, for
example, operation 5 in history 2 with operation 5 in
history j. The result of this part-part matching is of
course equivalent to a flow diagram of an algorithmic
language definition of the algorithm.



IV. Fundamental Restrictions Implicit in
Conventional Representational Forms

Clearly, the part-part matching information implicit in
standard algorithmic definitions is extremely useful,
and we will want to retain it in any alternative
representational form we might propose. Specifically,
this information leads to a very efficient use of space.
Roughly speaking, standard algorithmic formulations
approach maximum efficiency with respect to space and
minimum efficiency with respect to time. They make very
inefficient use of time because, very simply, they require
that only one thing be done at a time. The notion of
control as a unique "entity" which passes from one
statement or instruction to another forces a total
ordering on all the computations in the performance of
an algorithm; every history will necessarily consist of
a totally ordered sequence of operations. Consider the
following two sequences of statements:


    A=B-C        A=B-C
    X=A²         Y=√A
    Y=√A         X=A²
    Z=X+Y        Z=X+Y

These two sequences are computationally equivalent; an
algorithm writer would nevertheless be forced to choose
one of them arbitrarily. Assuming we had adequate
computing machinery, however, we could make more efficient
use of time by performing the second and third statements
concurrently. We might then find it useful to exhibit
this possibility explicitly by representing these four
statements as a partial ordering defined by the data
dependencies.
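One way to make this partial ordering concrete is to derive it from the statements themselves. The sketch below is an illustration, not the report's method: it records only which statement produced each operand (the "forward" dependencies), and the function name `concurrency_levels` and the (result, operands) encoding are hypothetical.

```python
def concurrency_levels(statements):
    """Group statements into levels; statements in one level share no data dependency.

    statements: list of (result, operands) pairs in their written order.
    """
    last_writer = {}  # variable-name -> index of the statement that last assigned it
    level = []        # level[i]: earliest step at which statement i could run
    for i, (result, operands) in enumerate(statements):
        producers = [last_writer[v] for v in operands if v in last_writer]
        level.append(1 + max((level[p] for p in producers), default=-1))
        last_writer[result] = i
    grouped = [[] for _ in range(max(level, default=-1) + 1)]
    for i, (result, _) in enumerate(statements):
        grouped[level[i]].append(result)
    return grouped

# The two computationally equivalent sequences from the text.
FIRST = [("A", ("B", "C")), ("X", ("A",)), ("Y", ("A",)), ("Z", ("X", "Y"))]
SECOND = [("A", ("B", "C")), ("Y", ("A",)), ("X", ("A",)), ("Z", ("X", "Y"))]
```

Both sequences yield the same three levels: the assignment to A, then the assignments to X and Y together, then the assignment to Z. The arbitrary choice between the two written orders disappears once only the data dependencies are kept.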




In general an algorithm consists of a set of partially
ordered operations, where the partial ordering is
determined by the data dependencies. We can obviously
increase efficiency with respect to time by performing
unordered operations concurrently.

Furthermore, once we admit the possibility of exploiting
partial ordering information by performing different
operations concurrently, it emerges that other arbitrary
restrictions have been implicitly imposed by the standard
representational forms. Consider, for example, the
following net model of a three-stage process:

The three stages are totally ordered, which would seem
to exclude any possibility of concurrent operation. Let
us assume that each stage takes one unit of time; it
will therefore take three units of time for each mapping
of an input to an output. On the other hand, if we
further assume that inputs can be provided (and outputs
accepted) by the environment at the rate of one per
time unit, then as soon as the first stage has finished
with the first input, it may accept the next input. Thus
the inputs can be pipelined so that all three stages are
operating concurrently. The throughput rate would then
approach one output per time unit. If we could abandon
the notion of control we might similarly be able to
represent the possibility of pipelining in an algorithmic
context. Suppose, for example, that the net diagram above
is a model of a program loop, with the process stages
representing the individual instructions and the ordering
relations corresponding to the data dependencies within
the loop. Then, despite the fact that the instruction
executions (within any one iteration of the loop) must
be totally ordered, all the instructions may be performed
concurrently (by overlapping executions from successive
iterations) so that the throughput rate is limited only
by the duration of the most time-consuming single
instruction. Similarly we could represent the possibility of
pipelining the algorithm as a whole, that is, the
possibility of an algorithm or program working on more
than one input set concurrently.
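The timing argument can be checked with a short simulation. This is a sketch under assumptions that the text does not spell out: inputs are always available when the first stage is free, and a stage may hand its result on as soon as it finishes (unbounded buffering between stages). The function names are illustrative.

```python
def pipeline_finish_times(n_inputs, durations):
    """Time at which each input leaves the last stage when stages overlap."""
    stage_free = [0] * len(durations)  # time at which each stage next becomes free
    finish = []
    for _ in range(n_inputs):
        t = 0  # the environment can supply this input at any time
        for stage, duration in enumerate(durations):
            start = max(t, stage_free[stage])  # wait for the data and for the stage
            t = start + duration
            stage_free[stage] = t
        finish.append(t)
    return finish

def sequential_finish_times(n_inputs, durations):
    """Time at which each input is finished when only one thing is done at a time."""
    total = sum(durations)
    return [(i + 1) * total for i in range(n_inputs)]
```

With three unit-time stages, four inputs finish at times 3, 4, 5, 6 when pipelined and at 3, 6, 9, 12 when processed one at a time. With stage durations 1, 3, 1 the pipelined outputs appear at 5, 8, 11, that is, one every three time units, which matches the claim that throughput is limited by the slowest single stage.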


Another interesting restriction implicit in the standard
view of algorithms is that of doing only what must
necessarily be done. Consider the following example:

If the computation of I is extremely lengthy we might
profitably "defer the decision". That is, while we are
computing I and then "making the decision", we might
concurrently pursue both of the alternative branches,
even though one of them will turn out to have been
"unnecessary". We could then use the result of the
decision to choose which of the alternative values of A
we wished to retain.
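Deferring a decision in this sense can be sketched with threads from the Python standard library. Several assumptions are made here: the decision is reduced to a two-way choice (the simplified FORTRAN IF is three-way), both branches are free of side effects so that discarding one is harmless, and the names `deferred_decision`, `compute_i`, `branch_if_true`, and `branch_if_false` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def deferred_decision(compute_i, branch_if_true, branch_if_false):
    """Pursue both candidate values of A while the decision value I is computed."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        decision = pool.submit(compute_i)        # the lengthy computation of I
        a_if_true = pool.submit(branch_if_true)  # one alternative value of A
        a_if_false = pool.submit(branch_if_false)  # the other alternative value of A
        # Only now is the decision "made": keep one value, discard the other.
        return a_if_true.result() if decision.result() else a_if_false.result()
```

For example, `deferred_decision(lambda: 1, lambda: "left", lambda: "right")` returns `"left"`. The work done on the other branch is wasted, which is exactly the trade of equipment for time that the paragraph describes.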


Given the possibility of concurrent operation, we might
also wish to question the automatic one-one mapping of
variable names to equipment locations. Two uses of the
same variable name might be entirely unrelated in terms
of data dependency and thus potentially concurrent if
mapped to different equipment locations. Other types of
space/time trade-offs might also become more interesting.
A computation which is a "bottleneck" in the performance
of an algorithm might be "duplicated" in a hardware
representation.

Many such possibilities for taking advantage of partial
ordering information and potential concurrent operation
are already exploited to a limited extent on both the
hardware and software levels. At the level of individual
instruction execution there is normally a very high
degree of concurrent operation, of course. However, at
what we might call the "programmable level" of machine
operation there is very little. Machines like the CDC 6600,
the CDC 7600, and the IBM 360/91 permit some concurrent
execution of instructions. The CDC 7600, furthermore, has
functional units which may be pipelined (so that a given
functional unit may be working on several instructions at
a time). Programs for these machines, however, must consist
of totally ordered sets of instructions, and the central
processor decodes the instructions sequentially; furthermore
the register and functional unit reservation schemes
impose further restrictions on parallel operation.
Consequently, potential concurrency is exploited to a
very limited degree and only very locally. Many machines
allow concurrent I/O processing, and there are a number
of machine designs which allow several central processors
to pursue loosely related computational paths concurrently.
The 360/91 allows a kind of decision-deferral or
"look-ahead" which involves pursuing the "most likely" branch
provisionally before a conditional branch instruction has
actually been executed.

Frequently, even though machine operation is sequential,
one sequence of operations will be more efficient than
another, computationally equivalent sequence (for example,
because it requires less intermediate storage) so that,
on the software level, there are a number of optimization
techniques which use partial ordering information for
resequencing. The same kind of information may also
expose redundant computations, caused either by several
redundant expressions or by an expression which is
invariant within a loop. Again, these techniques are applied
independently and (except for recognition of invariance
within a loop) only locally, usually within a single
flowblock. All of these procedures, both hardware and software,
are closely related; they are piecemeal attempts to provide
more efficient hardware implementations by circumventing
arbitrary restrictions imposed by the representational
forms in which algorithms are defined. No consistent
global exploitation of these possibilities can be
achieved, however, because the necessary information is
inaccessible in such representations.

V.  Partial  Ordering 

Many optimization techniques involve translations of
algorithms (or more usually, segments of algorithms)
into partial orderings representing data dependencies.
However, such partial ordering techniques normally preclude
statements of the form "a precedes a"; consequently
cycles must be excluded. Furthermore, no attempt is made
to represent the interaction of decisions with data
dependencies. This means that generation of all partial
ordering information for an algorithm would involve
producing a partial ordering for each possible control
history of the algorithm. This problem remains serious even
if we restrict our attention to one program loop. Either we
must limit ourselves to one iteration of the loop, in which
case we exclude all information about concurrencies across
different iterations of the loop (it is precisely this
information which can provide us with "pipeline" solutions),
or we must "unwind" the loop (i.e., treat it as one long
"straight-line" sequence of computations rather than as
a loop), which, although it does lead to explicit
representation of concurrencies across successive iterations,
necessarily means throwing away the part-part matching
information.

We  should  like  to  be  able  to  represent  algorithms  as 
partial  orderings  of  data  dependencies  without  sacrificing 
useful  part-part  matching  information.  We  will  therefore 
use  Petri  nets  as  our  basic  representational  medium.  Petri 
nets  can  be  used  to  exhibit  explicitly  both  partial  ordering 
and  part-part  matching  information  because  they  represent 
the  behavior  of  cyclic  systems  of  partially  ordered  events 
and  states.  For  a  brief  description  of  Petri  nets  see 
Appendix  I. 

Let  us  first  consider  the  use  of  Petri  nets  to  model 
algorithms  in  which  control  is  as  straightforward  as 
possible  —  that  is,  algorithms  without  any  conditional 
branches  and  hence  with  only  one  possible  control  history. 

We can represent each arithmetic or logical operation with
the following schema.


[Figure: Petri net schema for a single operation. Legend:]

operation initiation

o.i.p.      : operation in progress

not o.i.p.  : operation completed, not in progress

op1         : operand 1 available

op2         : operand 2 available

op1'        : operand 1 used in operation, not available

op2'        : operand 2 used in operation, not available

ur'         : result of operation may be made available; the
              corresponding use of the previous result (as an
              operand for some operation) has already taken place

ur          : result of operation available for the corresponding use
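Since the drawing of the operation schema is not reproduced here, the following sketch shows one plausible wiring of the conditions named in the legend. The wiring is an assumption, as are all identifiers (`PetriNet`, `operation_net`, the place names, and the environment transitions `supply_op1`, `supply_op2`, and `use_result`); only the ordinary Petri net firing rule is taken as given.

```python
class PetriNet:
    """Minimal place/transition net: a transition fires when all its inputs hold a token."""

    def __init__(self, marking, transitions):
        self.marking = dict(marking)    # place -> number of tokens
        self.transitions = transitions  # name -> (input places, output places)

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        return all(self.marking.get(place, 0) > 0 for place in inputs)

    def fire(self, name):
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for place in inputs:
            self.marking[place] -= 1
        for place in outputs:
            self.marking[place] = self.marking.get(place, 0) + 1

def operation_net():
    """Assumed wiring of the operation schema, with a single use of the result."""
    transitions = {
        # initiation needs both operands and an idle operation
        "initiate": (("op1", "op2", "idle"), ("op1_used", "op2_used", "in_progress")),
        # completion needs the previous result to have been used already (ur')
        "complete": (("in_progress", "result_free"), ("idle", "result_avail")),
        # hypothetical environment: other operations refill operands and use the result
        "supply_op1": (("op1_used",), ("op1",)),
        "supply_op2": (("op2_used",), ("op2",)),
        "use_result": (("result_avail",), ("result_free",)),
    }
    marking = {"op1": 1, "op2": 1, "idle": 1, "result_free": 1}
    return PetriNet(marking, transitions)
```

In this wiring the operation can be initiated a second time as soon as new operands arrive, but it cannot complete until `use_result` has fired. That is the behaviour the legend attributes to the condition ur': a new result may be made available only after the previous one has been used.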


Note  that  each  use  of  a  given  result  is  represented 
uniquely.  Accordingly,  each  use  of  a  variable-name  as 
an  operand  (i.e.,  each  occurrence  of  the  variable-name 
on  the  right  side  of  an  assignment  statement)  will  be 
represented  as  follows: 


[Figure: Petri net schema for one use of a variable x. Legend:]

operation i, which generates values for x, in progress

operation j, which uses x as an operand, in progress

x is available as an operand for operation j. The
value of x may not yet be changed.

x used as an operand for operation j. The value of x
may be changed as a result of operation i.


Note  that  the  schemata  for  operations  and  for  variable- 
uses  have  cyclic  behaviors  and  that  therefore  the 
algorithmic  representation  which  we  have  constructed  from 
them  also  behaves  cyclically.  One  important  consequence  of 
this  is  that  our  representation  expresses  not  only  "forward" 
data  dependencies  (the  computation  of  the  next  value  for  X 
cannot  begin  until  the  current  operand  value  for  A  has 
been  computed)  but  "backward"  data  dependencies  as  well  (the 
value  of  A  may  not  be  changed  to  its  next  value  until  the 
current value has been used by the operations which compute
X and Y).

As long as an algorithm contained no decisions, we could
apply these schemata throughout to obtain an adequate
representation. As soon as we admit branching, however,
this procedure is no longer adequate, because decisions
render the dependency relations variable. Consider, for
example, the following flowblock diagram, in which we will
be concerned only with the variable A:

[Figure: flowblock diagram with flowblocks I, II, and III;
statement n, an assignment to A, lies in flowblock I, and
statement m, which uses A, lies in flowblock II.]

Statement n generates a value for A which is used in
statement m. Each time a value is generated for A at
statement n (each time control flows through flowblock
I), that value may be used at statement m once, or many
times, or not at all (control may flow from I to II to III;
or it may flow from I to II and then recirculate through
II any number of times; or it may flow from I to III and
back to I again). Or we might complicate the picture
slightly so that there are two alternative statements,
either one of which may have generated the value used
in any given execution of statement m.

In any algorithm which contains branching, the data
dependencies are "variable"; that is, they are determined
by the particular path "chosen by control" when the
algorithm is executed. It should be kept in mind,
furthermore, that both forward and backward data dependencies are
at issue; we are interested not only in when the appropriate
value for an operand is available and may be used but also
in when a value is no longer needed and a new value may be
provided. The data dependencies in an algorithm with
branching constitute what we might call a "variable partial
ordering". Clearly, before we can consider the problem
of representing such variable partial orderings, we will
have to deal with the problem of extracting the necessary
data dependency information from the algorithmic language
definition of an algorithm.
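The way a control path fixes the dependency between statement n and statement m can be illustrated by enumerating paths through the three flowblocks. The arc set below is an assumption reconstructed from the paths the text names (I to II, II to II, II to III, I to III, III to I); the identifiers `EDGES`, `paths`, and `uses_per_value` are hypothetical.

```python
# Assumed arcs between flowblocks, taken from the paths mentioned in the text.
EDGES = {"I": ("II", "III"), "II": ("II", "III"), "III": ("I",)}

def paths(start, length):
    """Yield every control path that visits exactly `length` flowblocks."""
    if length == 1:
        yield (start,)
        return
    for successor in EDGES[start]:
        for rest in paths(successor, length - 1):
            yield (start,) + rest

def uses_per_value(path):
    """For each execution of statement n (flowblock I), count the executions
    of statement m (flowblock II) that use the value it generated."""
    counts = []
    for block in path:
        if block == "I":
            counts.append(0)      # a new value of A is generated
        elif block == "II" and counts:
            counts[-1] += 1       # the most recent value of A is used once more
    return counts
```

For the path I, II, III the value is used once; for I, II, II, II, III it is used three times; for I, III, I the first value is never used before it is replaced. A fixed partial ordering cannot express all three cases at once, which is why the text speaks of a variable partial ordering.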

VI.  Variable-Names  and  Data  Dependency  Relations 

We have already discussed one role of variable-names: a
variable-name represents a set of mutually exclusive
values; at any time during performance of the algorithm,
(at most) one of these values will hold. We have also
considered another role of variable-names: different
(and possibly unrelated) uses of the same variable-name
may be, and normally will be, mapped to the same
machine location; thus the various uses of a given
variable-name constitute part-part matching information.
(Of which we may or may not wish to take advantage; as
we have already pointed out, it will frequently prove
advantageous to map different uses of the same
variable-name to different machine components in order to allow these
uses to be concurrent.) We will now want to consider
another role of variable-names; we will want to examine
the ways in which variable-names interact with control to
determine data dependencies.

Consider the following example, in which, again, we are
interested only in the variable A.


We said earlier that an occurrence of a variable-name on
the right side of an assignment statement (i.e., a use of
a variable) represents the result of the most recently
executed assignment to that variable-name. Whenever
statement n or statement o is executed, the occurrences
of A in these statements must necessarily represent the
value produced by the most recent execution of statement m.
Consequently statements n and o are ordered with respect
to statement m: statement n is later than statement m
and statement o is later than statement m. Because
the uses of A in statements n and o always represent
the value generated by the most recent execution of
statement m (i.e., m is the only assignment statement which
can produce values for those uses), we call those uses of
A members of the A equivalence class generated by
statement m. Similarly, the use of A in statement
q is a member of the A equivalence class generated by
statement p. Let us now expand the example as follows.




After each generation of the A equivalence class
generated by statement p, the use of A in statement
r may occur once, or many times, or not at all before the
generation of some other A equivalence class. However,
the use of A in statement r always (if and whenever
it occurs) represents the value generated by the most
recent execution of statement p, and therefore it is
a member of the A equivalence class generated by
statement p. One way, then, in which variable-names and
control interact to determine data dependency is in the
generation by assignment statements of equivalence classes,
which define ordering relations between operations. Each
occurrence of a variable-name on the left side of an
assignment statement represents the generation of an
equivalence class; each use of that variable-name for which
that assignment statement is always the most recent
assignment to that variable-name is a member of that
equivalence class.

Let us now consider an example which illustrates another
type of data dependency.


Here there are two possible statements, m and n, which
may generate a value for the use of A in statement o.
These uses of the variable-name A express both a
part-part matching and a set of ordering relations. The results
of two alternative computations are "merged"; whenever
statement o is executed, it uses as an operand the most
recent result of either statement m or statement n,
whichever was executed most recently. In the context
of our earlier discussion, we can describe merges as
the result of part-part matching: either the "folding
up" of one history or the "overlaying" of different
histories (or both). In hardware terms, such part-part
matching, represented by merges, allows the mapping of
alternative control histories onto the same equipment.
Statements m and n both generate A equivalence
classes. But since the use of A in statement o
represents a value which may have been produced by either
m or n, it cannot be a member of either equivalence
class. Because we will want the A equivalence classes
to constitute a partition of all uses of A, we will
let the uses of A in statements o and p constitute
a third equivalence class. Roughly speaking, we can think
of this equivalence class as having been generated when
either statement m or statement n has been executed,
and a decision has been made to go to flowblock III. In
the flowblock diagram we would locate the generation of
the equivalence class at the "merge-point", that is, at
the entry to flowblock III. Note that whenever statement
o is executed, this merge-point will then always be the
most recent point at which an A equivalence class was
generated. In terms of the data dependencies, a set of
alternative ordering relations has been defined: either
o is later than m, or o is later than n. We have,
then, two types of equivalence classes, which represent
ordering relations between operations.

We shall want to partition the uses of each variable-name
into equivalence classes. To accomplish this we use an
algorithm by Warshall, which is described in Appendix II.
Here we shall restrict ourselves to a very brief account of
the algorithm. It might be useful to recast the problem of
partitioning variable-uses into equivalence classes in more
familiar terms. A common optimization problem (and one which
is normally dealt with only locally) is the elimination
of redundant computation, or common subexpression
elimination. Suppose, for example, that the expression SINF(A)
appears twice in an algorithm. We would like to know
whether we can compute the sine of A once for both uses.

We would like to know, roughly speaking, if both instances
of A "always represent the same value." More precisely,
is there a point p in the flow diagram such that on no
path from p to either of the uses is there an A-assignment
and such that there is no path from any A-assignment
to either of the uses which does not pass through p?
(If such a point exists, we can safely place the computation
of the sine of A there.) Based on our definition of an
equivalence class, we can restate this question as follows:
Are both instances of A members of the same equivalence
class? Warshall's algorithm, then, may be thought of as
a global solution of the problem of common subexpression
elimination.

As an example, let us take the flow diagram below and
apply Warshall's algorithm to the variable A. Only the
occurrences of A actually appear in the diagram, and
the statements in which they occur have been numbered
for convenience.


We expand this graph by replacing each flowblock-node F
with a totally ordered set of nodes Φ, as follows:

-  if F is the entry flowblock, one node called
   the entry-node, which is the earliest node in
   Φ; or

-  if F has more than one input arc, one node
   called a flow-node, which is the earliest node
   in Φ; and

-  a set of instance-nodes, one corresponding to
   each instance of A in flowblock F; these
   nodes are ordered according to the order in
   which the corresponding instances of A occur
   within the flowblock (where the left side of
   an assignment statement is later than the right
   side); and

-  if F has more than one output arc, one node
   called a decision-node, which is the latest
   node in Φ; or

-  if F is the exit flowblock, one node called
   the exit-node, which is the latest node in Φ.
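
The expansion rules in the list above can be sketched in Python.
This is only an illustration under stated assumptions: the
function name expand_flowblock, the tuple layout of the nodes and
the "use"/"def" tags are hypothetical, and we assume that an
entry-node takes precedence over a flow-node, and an exit-node
over a decision-node, when both would apply.

```python
def expand_flowblock(name, instances, n_inputs, n_outputs,
                     is_entry=False, is_exit=False):
    """Return the ordered node list that replaces flowblock `name`.

    `instances` lists the occurrences of the variable in textual
    order, each tagged "use" (right side) or "def" (left side).
    For a statement such as A = A + 1 the caller lists the use
    before the def, since the left side counts as later.
    """
    nodes = []
    # Earliest node: entry-node, or flow-node when arcs merge here.
    if is_entry:
        nodes.append((name, "entry", None))
    elif n_inputs > 1:
        nodes.append((name, "flow", None))
    # One instance-node per occurrence, in order of occurrence.
    nodes.extend((name, kind, i) for i, kind in enumerate(instances))
    # Latest node: exit-node, or decision-node when control branches.
    if is_exit:
        nodes.append((name, "exit", None))
    elif n_outputs > 1:
        nodes.append((name, "decision", None))
    return nodes
```

An empty result corresponds to the case in which the flowblock
contributes no nodes at all and is simply bypassed.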

All input arcs of F become input arcs of the earliest
node in Φ; all output arcs of F become output arcs of
the latest node in Φ. If Φ is empty (i.e., F has
one input arc, one output arc, and contains no instance of
the variable), then it must have some unique immediate
predecessor flowblock G and some unique immediate successor
flowblock H; replace the arc from G to F and the arc
from F to H with one arc from G to H. The resulting
graph for the example above is the following:

Note that there is still a unique node which is earlier
than every other node in the graph (the entry-node) and
a unique node which is later than every other node in the
graph (the exit-node). The purpose of the algorithm is to
subscript the instances of A in such a way that they
are partitioned into equivalence classes. This means
that we want to identify a minimal set of nodes in the
above graph as equivalence-class-generators such that
every other node in the graph has a unique most recent
equivalence-class-generating ancestor. We know that all
assignments to A generate equivalence classes; therefore
we will circle all nodes representing left-side instances
of A and label them uniquely as A_c1, A_c2, ..., A_cn.
We will also circle the entry node and label it A_c0.


It remains to determine the equivalence classes which
must be generated because of merges. Roughly speaking,
the algorithm accomplishes this by pushing the name of
each circled node along directed arcs to all uncircled
nodes which can be reached without encountering another
circled node. When two different names meet on an
uncircled node, that node is circled; such newly circled
nodes are uniquely labeled as A_f1, A_f2, ..., A_fm.
The names of the newly circled nodes are also propagated
until no more nodes may be circled. (It should be
intuitively clear that only flow-nodes are candidates
for circling.) Upon completion of the algorithm every
node is either circled or has associated with itself the
name of exactly one circled node: namely, its unique most
recent circled ancestor. The names define the partition
into equivalence classes. Each circled node corresponds
to an equivalence-class-generating event: either an
assignment to A or a merge.
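
The name-pushing step can be sketched as follows. This is an
illustrative reconstruction from the brief account above, not
the algorithm of Appendix II; the function name and the
dictionary encoding of the graph are assumptions. Instead of
pushing names incrementally, the sketch recomputes all names
after each round of circling, so that a node lying downstream of
a newly circled merge is not circled by mistake. It assumes that
every node can be reached from the entry-node.

```python
def partition_instances(succ, generators):
    """Circle merge nodes and name every other node.

    succ       -- dict mapping each node to a list of its successors
    generators -- nodes circled at the start (entry-node, assignments)
    Returns (circled, name): the final set of circled nodes, and a
    dict giving each uncircled node its most recent circled ancestor.
    """
    nodes = set(succ) | {s for ss in succ.values() for s in ss}
    preds = {n: [] for n in nodes}
    for n, ss in succ.items():
        for s in ss:
            preds[s].append(n)

    circled = set(generators)
    while True:
        # Push names forward until nothing changes. A circled node
        # emits its own name and blocks names arriving from upstream.
        names = {n: set() for n in nodes}
        changed = True
        while changed:
            changed = False
            for n in nodes - circled:
                arriving = set()
                for p in preds[n]:
                    arriving |= {p} if p in circled else names[p]
                if arriving != names[n]:
                    names[n], changed = arriving, True

        def emitted(p):
            return frozenset({p}) if p in circled else frozenset(names[p])

        # A node is a merge point when its immediate predecessors
        # deliver different (non-empty) sets of names.
        merges = {n for n in nodes - circled
                  if len({emitted(p) for p in preds[n]} - {frozenset()}) > 1}
        if not merges:
            break
        circled |= merges

    name = {n: next(iter(s)) for n, s in names.items()
            if n not in circled and s}
    return circled, name
```

In this encoding the name of an equivalence class is simply the
identifier of its circled node.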
Let us make one further modification in the above graph,
as follows. For each circled flow-node, which defines
some equivalence class A_fi, consider the names associated
with those nodes which are its immediate predecessors.
For each such name A_kj (where k represents either f
or c), and there must be at least two different names,
introduce a new uncircled node into the graph, and
assign it the name A_kj. Introduce new arcs such that
this new node is the immediate successor of each A_kj
node which was an immediate predecessor of the circled
A_fi node. Eliminate the arcs from these immediate
predecessors to the circled A_fi node, and introduce a new
arc from the new A_kj node to the circled A_fi node.
Application of this procedure to the example above produces
the following graph:


[Figure: the complete p-graph of A for the example.]


We call this graph the complete p-graph of A. Let us
provide several further definitions for future use. We
call the set of all nodes in the complete p-graph of A
which have associated with them the name A_ki the members
of the equivalence class A_ki. We call the set of all
nodes which are immediate successors of members of some
equivalence class A_ki but which are not themselves
members of the equivalence class A_ki the exit nodes of A_ki.
We define the graph of the equivalence class A_ki as the
subgraph of the complete p-graph of A which contains the
members of A_ki together with the exit nodes of A_ki.
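
The construction of the complete p-graph and the definitions of
members and exit nodes can be sketched as below. The function
names, the tuple used to identify each newly introduced node, and
the dictionary encoding are hypothetical. We also assume that a
class is named by its circled node and that the circled node
itself is not counted among the members, so that control
returning to the generating node counts as leaving the class;
the text does not settle this point.

```python
def complete_p_graph(succ, name, flow_merges):
    """Put one new uncircled node per incoming class in front of
    each circled flow-node.

    succ        -- dict: node -> list of successors
    name        -- dict: node -> class name (a circled node maps
                   to its own class)
    flow_merges -- the circled flow-nodes
    Returns new (succ, name) dicts; the inputs are left untouched.
    """
    new_succ = {n: list(ss) for n, ss in succ.items()}
    new_name = dict(name)
    for f in flow_merges:
        for p, ss in succ.items():
            if f not in ss:
                continue
            entry = ("into", f, name[p])      # hypothetical node id
            new_succ[entry] = [f]             # new node -> merge
            new_name[entry] = name[p]         # it keeps p's class name
            # Redirect the old arc p -> f to the new node.
            new_succ[p] = [entry if s == f else s for s in new_succ[p]]
    return new_succ, new_name


def class_graph(succ, name, circled, cls):
    """Members, exit nodes and member arcs of the graph of `cls`."""
    members = {n for n, c in name.items()
               if c == cls and n not in circled}
    exits = {s for n in members for s in succ.get(n, [])
             if s not in members}
    arcs = {n: list(succ.get(n, [])) for n in members}
    return members, exits, arcs
```

Predecessors that carry the same name share a single new node,
which matches the rule of one new node per name.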

Thus  the  graph  of  the  equivalence  class  in  our  example 

would  be  the  following: 


VII.  The  Translation  of  Conventional  Algorithms 
into  Cyclic  Partial  Orderings 


Let us assume that we have applied Warshall's algorithm
to each variable in some algorithm. We can now consider
the problem of giving this data dependency information
explicit representation and of relating it to decisions.
Let us continue to represent operations as before.

[Figure: operation schema, with an initiation transition and a
completion transition.]

For each assignment in an algorithm we will produce one
such operation representation. We can represent decisions
(i.e., IF statements) similarly, except that we will
represent the various possible outcomes or
decision-resolutions explicitly as net conflict.

A decision has a variable as an operand just like any
other operation. (Note that for each decision in an
algorithm there will be in each p-graph of some variable
in the algorithm a unique decision-node corresponding to
that decision. Similarly, for each decision-resolution
there will be a unique arc, one of the output arcs of
the decision-node, corresponding to that
decision-resolution.) Since we are aiming at a representation
which explicitly exhibits data dependencies and since
these data dependencies are determined by the interaction
of control with variable-names, we will want, roughly
speaking, to link decision results directly to
variable-uses to generate ordering relations between operations.
Therefore we will expand our previous representation of
a variable-use from:


[Figure: the previous variable-use schema and its expanded form.]

Before giving formal rules for applying this schema,
let us describe it in informal, approximate terms.
A_ki has the same interpretation as in our previous
schema: the current value of the variable is available
for this use; it may not yet be changed to the next
value. Similarly, the place labeled (u) means: the current
value has already been used and is no longer needed; it may
be changed to the next value. Another place represents a sort
of limbo: the current value is available, but this use
of it may or may not take place (before the generation of
its next value), depending on the outcome of one or more
decisions. Places c, d, and e are connected to
transitions representing decision-resolutions. A
decision-resolution which causes this use of A to take place
(before the next generation of the equivalence class) has
e as an output (and is called an "enabling event" for this
variable-use) and c as an input. A decision-resolution
which guarantees that this use of A will not take place
(before the next generation of the equivalence class) has
d as an output (and is called a "disabling event" for this
variable-use) and c as an input. c means: the last
decision result affecting this variable-use has "taken
effect"; the next relevant decision result may be generated.


The transition labeled "use of A_ki" represents the
initiation of the operation in which this instance of A
is an operand. If the equivalence class A_ki is
generated by an assignment, then the transition labeled
"generation of A_ki" represents the completion of the
operation which provides values for that equivalence
class. All representations of variable-uses which are
members of the same equivalence class will, of course,
share the same generating transition and have different
use transitions. If the equivalence class A_ki is
generated by a merge of several A equivalence classes,
we create a set of alternative generating events, one
for each equivalence class which participates in the
merge. Each such generating event will consist of one
transition which has as an "operand" a variable-use
representation which is a member of one of the merging
equivalence classes. Each of these alternative
generating transitions will, of course, generate the
entire equivalence class. For example:


[Figure: an equivalence class with two alternative generating
transitions.]


The arcs a and b are alternatives: exactly one of
the two is present in any given representation of a
variable-use. If each time the equivalence class A_ki is
generated, this use must take place at least once, then
arc a is present and not arc b. If, after each
generation of the equivalence class A_ki, this use may
or may not take place, then arc b is present and not arc
a. Similarly, arcs f and g are alternatives. If for
each generation of the equivalence class this use may take
place at most once, then arc f is present and not arc g.
If for each generation of the equivalence class this use may
take place more than once, then arc g is present and not
arc f.


Let us now restate these rules more precisely. For
each uncircled node in the complete p-graph of A which
is not a flow-node or a decision-node, we will produce
a variable-use representation in accordance with the
schema above and the following rules. Consider any such
node A_ki(u), which is a member of some equivalence class
A_ki.

-  If the equivalence class A_ki is generated by
   an assignment, then the generating transition
   of A_ki is the termination transition of
   the operation corresponding to the generating
   node of A_ki.

-  If the equivalence class A_ki is generated by
   a merge, then there is a set of alternative
   generating transitions for A_ki, one
   corresponding to each immediate predecessor
   node of the circled A_ki node.

-  If the node A_ki(u) does not have as an
   immediate successor a circled flow-node, then
   its use transition is the initiation transition
   of the operation associated with it.

-  If the node A_ki(u) has as an immediate
   successor a circled flow-node, then its use
   transition is one of the set of alternative
   generating transitions for the equivalence
   class defined by the circled flow-node.

-  If every path from the circled A_ki node to
   an exit-node of A_ki contains A_ki(u), then
   the representation of A_ki(u) contains arc a
   and not arc b. Otherwise it contains arc b
   and not arc a.

-  If in the graph of A_ki there exists a circuit
   such that all nodes in the circuit are members
   of A_ki and such that A_ki(u) is contained in
   the circuit, then the representation of A_ki(u)
   contains arc g and not arc f. Otherwise it
   contains arc f and not arc g.

-  For each decision-node A_ki(y) in A_ki such that
   there exists a path from A_ki(y) to A_ki(u)
   which is contained in A_ki, and such that there
   exists at least one path from A_ki(y) to some
   exit-node of A_ki which does not contain A_ki(u):
   Let P be the set of all paths p such that the
   first node of p is A_ki(y) and the last node
   of p is an exit node of A_ki and such that the
   last node of p is the only node in p which is
   not a member of A_ki. Partition P into subsets
   P_1, P_2, ..., P_n according to the second
   node of each member path (i.e., according to the
   branch taken at the decision), so that each
   subset corresponds uniquely to a resolution of
   the decision.

   -  For each subset P_i such that all members
      of P_i contain A_ki(u), let the
      decision-resolution transition corresponding to P_i
      have as an output place e (in the
      representation of A_ki(u)) and as an input
      place c.

   -  For each subset P_j such that no member
      of P_j contains A_ki(u), let the
      decision-resolution transition associated with P_j
      have place d as an output and place c
      as an input.
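
The path conditions in the last three rules can be checked by
simple graph searches. The following is a sketch only: the
function names and the set-and-dictionary encoding of an
equivalence class are assumptions, the generating node is
treated as lying outside the set of members, and any non-member
reached from the class is treated as an exit node.

```python
def _explore(succ, members, start, blocked=None):
    """Nodes reached from `start` by walking through members only.
    Non-members (exit nodes) are recorded but not expanded, and
    the node `blocked` is never entered."""
    seen, stack = set(), [start]
    while stack:
        n = stack.pop()
        if n in seen or n == blocked:
            continue
        seen.add(n)
        if n in members:
            stack.extend(succ.get(n, []))
    return seen


def arc_choices(succ, members, generator, use):
    """Return ('a' or 'b', 'f' or 'g') for the variable-use `use`."""
    # Arc b: some path leaves the class without passing through `use`.
    escapes = any(x not in members
                  for s in succ.get(generator, [])
                  for x in _explore(succ, members, s, blocked=use))
    # Arc g: `use` lies on a circuit made up of members only.
    repeats = any(use in _explore(succ, members, s)
                  for s in succ.get(use, []))
    return ("b" if escapes else "a", "g" if repeats else "f")


def classify_resolutions(succ, members, decision, use):
    """Map each branch of `decision` to 'enable', 'disable' or None.
    Returns {} when the decision does not affect this use."""
    reaches_use = use in _explore(succ, members, decision)
    may_skip = any(x not in members for x in
                   _explore(succ, members, decision, blocked=use))
    if not (reaches_use and may_skip):
        return {}
    result = {}
    for branch in succ.get(decision, []):
        if use not in _explore(succ, members, branch):
            result[branch] = "disable"   # no path contains the use
        elif all(x in members for x in
                 _explore(succ, members, branch, blocked=use)):
            result[branch] = "enable"    # every path contains the use
        else:
            result[branch] = None        # settled by a later decision
    return result
```

A branch mapped to None is one whose paths neither all contain
nor all avoid the use, so the rules assign it no effect on this
variable-use.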


Constants are treated similarly. For each use of a constant,
we produce a representation in accordance with the schema and
rules for variable-uses. However, since there can be no
generation event for a constant, part of the schema will be
superfluous (as indicated in the following figure by
broken lines).

[Figure: constant-use schema, with the superfluous part drawn in
broken lines.]

Furthermore, if place e is not an output of any transition
(i.e., if the constant-use in question occurs in
every control history of the algorithm), then we will
eliminate places e and c from the representation as
well, so that the value is made available again after
each use, independently of any other computation or
decision. This would leave us with the following schema:


These representational schemata for variable-uses (and
constant-uses) and decisions differ radically from
conventional representations. A decision is no longer viewed
as a point in a flow diagram at which control chooses one
of several alternative paths and a decision-resolution
simply as the choice of one of those alternative
computational paths. Instead each decision-resolution has a
set of results. Each of the results affects the status of
some one variable-use, either enabling or disabling it;
i.e., each decision-result determines either a forward
or a backward data-dependency relation. One important
aspect of this is that the various effects of a
decision-resolution are given individual, explicit representation.
Even more interesting, however, is the fact that this
schema is free of the dualism of conventional
representations: control and computation no longer have different
ontological status; decision results and computational
results alike are explicitly represented as conditions
(or "sub-states" or "signals") in a partially ordered,
cyclic system of such conditions.

Having  explained  our  representational  schemata  in  detail, 
we  will  now  replace  them  with  more  concise  notational 
forms.  We  shall  replace  the  operation  schema  with  a 
double  bar. 


To model decisions, we shall break the lower bar to
represent the various possible decision-resolutions.
Furthermore, we shall name each decision-resolution in
the algorithm uniquely.


We shall replace the variable-use model with a rectangle;
an output arc will connect the generating transition to
the variable-use model; an output arc will connect the
variable-use model with its use transition. A diagonal
in the upper right corner of the rectangle indicates the
existence of arc a; its absence indicates the existence
of arc b. A diagonal in the lower left corner of the
rectangle indicates the existence of arc f; its absence
indicates the existence of arc g. The names of all
enabling events of the variable-use (i.e., inputs of place
e) are listed along the right edge of the rectangle. The
names of all disabling events (i.e., inputs of d) are
listed along the left edge.


Before we can apply our representational procedures to
an example, we must (for the time being at least) impose
one further restriction: all decisions must be ordered.
This is easily accomplished since every decision-resolution
involves a commitment to a unique next decision.
Therefore, for each decision in the algorithm, we
create a place which is input to the initiation transition
of that decision; we can then make this place an output of
every decision-resolution transition which has that decision
as its immediate successor decision.

VIII.  An  Example  of  the  Translation  Procedure 

Let us now take the following algorithm segment as an
example for translation into our representational form.
For the sake of convenience and clarity we will number
each statement and each decision-resolution.

We now have a representation which expresses most of
the obvious kinds of concurrency possible for the
algorithm. It consists of a partial ordering of
operations determined exclusively by the data dependencies
(with the exception of the ordering of decisions).
Control has been largely eliminated. Each decision
interacts explicitly with each variable-use it affects.


IX.  Pipelining


There  is,  in  this  representation,  a  certain  amount  of 
"play"  between  decisions  and  value  availability:  a 
value  may  be  available  for  use  in  an  operation  before 
a  commitment  has  been  made  to  perform  that  operation; 
conversely,  the  commitment  to  perform  an  operation  may 
be  made  before  a  necessary  operand  value  has  been 
generated.  Because  of  the  fact  that  algorithms  contain 
cycles,  it  will  be  to  our  advantage  to  increase  this 
freedom  as  much  as  possible.  For  example,  we  might 
consider  a  loop  in  which  the  control  variable  is  computed 
independently  from  the  other  operations  in  the  loop.  If 
we  could  get  several  iterations  ahead  with  the  decisions, 
we  could  "wind  up"  the  loop  and  achieve  a  pipeline  effect. 
That is, ideally it might be possible to have all the
operations in progress concurrently so that the throughput
rate for the loop would be determined by the time
required for the longest individual operation. To allow
this kind of concurrency we will introduce a simple net
structure which might be variously interpreted as a
buffer, a stack, a queue, or a pipeline.


We have already used this structure to illustrate
pipelining. We might also interpret each pair of places as
representing a location in which a value or a signal
may be stored. If the left place is full, that location
is empty and may receive a value. If the right place is
full, that location contains a value, which it may
transmit to the next location (it is not possible for both places
in a pair to be full, or empty). We could then view this
structure as a first-in-first-out stack. Signals are
dropped in at the top and taken out at the bottom in order;
the stack may hold as many values concurrently as it has
place-pairs. Let us now suppose that there are two kinds
of signals which may be placed into a stack and that we
would like to distinguish between them explicitly.
Furthermore, we will want to preserve the order in which they
enter the stack. We can represent such a bi-valued stack
as follows.


In this fashion, we can create an arbitrarily high
degree of freedom between decisions and computations.
Furthermore, since the decision results affecting each
variable use are given individual representation, this
means that we may thereby increase the freedom between
different computations.
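
The behavior of such a chain of place-pairs can be imitated with
a short simulation. This is a sketch under assumptions of our
own: the class name PlacePairStack and its methods are
hypothetical, a single cell stands for one place-pair, and the
transfers are fired one after another from the bottom upward,
whereas in the net they may occur independently.

```python
class PlacePairStack:
    """A chain of place-pairs. A cell holds None while its left
    ("empty") place is marked, or a signal while its right ("full")
    place is marked, so a pair is never both full and empty."""

    def __init__(self, length):
        self.cells = [None] * length          # index 0 is the top

    def put(self, signal):
        """Drop a signal in at the top; False if the top is full."""
        if self.cells[0] is not None:
            return False
        self.cells[0] = signal
        return True

    def step(self):
        """Fire every enabled transfer transition once, bottom first.
        A transfer needs a full cell followed by an empty one."""
        for i in range(len(self.cells) - 2, -1, -1):
            if self.cells[i] is not None and self.cells[i + 1] is None:
                self.cells[i + 1], self.cells[i] = self.cells[i], None

    def take(self):
        """Remove the signal at the bottom, or return None if empty."""
        signal, self.cells[-1] = self.cells[-1], None
        return signal
```

Storing one of two distinct signals in a cell, for example
"enable" and "disable", gives the bi-valued case; the order of
entry is preserved because a signal can only advance into an
empty cell.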

X.  Control  and  Merges 


We have considerably increased the power of our notation
to represent potential concurrency, but our representation
still contains arbitrary sequencing restrictions. The
most obvious and serious of these is the ordering of
decisions. Let us briefly consider two important
implications of this restriction. First, we would like
to be able to pipeline the algorithm as a whole so that
it may concurrently process more than one set of inputs.
As long as decisions are totally ordered, no significant
amount of pipelining will be possible, since all decisions
involved in the processing of one input set must clearly
have been made before any decision involved in processing
the next input set may be made. Secondly, let us consider
the following example.

is excluded.

On the other hand, we cannot simply throw out the
ordering of decisions altogether. To show why this
straightforward solution is inadequate, we will try
abandoning the ordering of decisions in the following
example, in which we will be specifically concerned with
the merge of the variable A at flowblock IV. We have
named the decisions in this diagram a, b, and c, and
we have named the decision-resolutions i, ii, iii, iv,
v, vi, and vii.


If we assume appropriate data-dependencies, the following
set of events is possible: "Control" enters flowblock I
and at decision a it chooses resolution ii. Let us
assume that decision b is extremely time-consuming and
that while this decision is being made, "control" (or
"part of control", perhaps) skips ahead to flowblock VI
and re-executes decision a; this is, of course,
permissible because decision a must be encountered
again regardless of the outcome of decision b. Let us
suppose that this time resolution i is chosen, and
control enters flowblock III, where a value is generated
for A. Decision c is executed, enabling this value to
enter the merge and provide a value for the merged
equivalence class of A. At this point decision b is finally
completed; resolution iv is chosen, enabling an earlier value
of A to provide a value for the merged class at
the merge. Since the two "entries" into the merge
occurred in the wrong order, however, any computations
which use the merged class will have been rendered
meaningless.

Roughly speaking, wherever there is a merge (i.e., part-part
matching), we must keep track of the logical priorities
of  the  various  claims  which  may  be  made  on  a  representational 
"part".  The  different  uses  of  such  a  "part"  can  only  be 
distinguished  by  the  order  of  their  occurrence.  Hence  we 
will  want  to  determine  which  decisions  are  critical  in 
maintaining  priorities  among  "entries"  into  a  merge.  If  we 
then order the effects of these decisions on the merge, we
can allow the decisions themselves to take place in any
order.


We  can  briefly  outline  a  procedure  for  identifying  the 
set  of  decisions  which  are  critical  for  a  given  variable- 
merge.  Take  the  p-graph  of  the  variable  in  question  and 
delete  all  circles.  Circle  the  exit  node  and  each  of 
the  immediate  predecessor  nodes  to  the  merge  node  in 
question.  Reverse  the  direction  of  every  arc  in  the 
p-graph  and  apply  Warshall's  algorithm.  This  will  cause 
the  desired  set  of  decision  nodes  to  be  circled  (and 
only  those  nodes).  We  can  then  use  this  information  to 
order  entries  into  the  merge.  In  the  example  above, 
decisions  a  ,  b  ,  and  c  constitute  the  set  of 
interesting  decisions  for  the  merge  into  .  We  can 

order the effects of those decisions as follows (note
that the order in which the decisions themselves take
place is not affected):


The  desequencing  of  decisions  may  also  lead  to  a  similar 
problem with certain variable-uses, and a solution like
that  for  merges  is  applicable. 



XI.  Proposed Extensions of the Representational Form

We  have  outlined  procedures  which  make  possible  the 
translation  of  a  sequentially  defined  algorithm  into  a 
powerful  representation  of  highly  concurrent  execution 
of  the  algorithm.  Roughly  speaking,  each  operation  may 
take  place  when  (1)  the  necessary  operand  values  are 
available,  (2)  enough  decisions  have  been  made  to 
guarantee  that  the  operation  will  be  required,  and 
(3)  enough  decisions  have  been  made  to  guarantee  that 
no  logically  prior  claim  can  be  made  on  the  algorithmic 
parts  involved.  All  sequencing  has  been  stripped  out 
except  that  which  is  given  by  data  dependencies  or  by 
priorities  for  part  use.  In  the  process,  control  has 
been  dismembered  and  the  useful  information  which  it 
carries  has  been  broken  down  into  individual  ordering 
relations. 

This  is  as  far  as  we  can  carry  the  development  of  this 
representational  apparatus  in  this  discussion,  but  we 
would  like  to  mention  several  possible  extensions  and 
applications.  For  example,  we  have  already  mentioned 
the  fact  that  one  arbitrary  restriction  imposed  by  the 
notion  of  control  is  that  nothing  may  be  executed  which 
is  not  computationally  necessary.  However,  it  may  prove 
more  efficient  to  defer  some  decisions  —  to  pursue  one 
or more alternative branches provisionally before the
choice among them has been made. We now have a representation
which exhibits explicitly which variable-uses are
affected by a given decision. Therefore, we could
mechanically build decision-deferral into our representation
by moving enable/disable connections to other variable-uses
which are later in the chain of data-dependencies
—  so  long  as  we  provide  logical  machinery  to  discard 
rejected  values.  Where  two  such  alternative  paths 
merged,  furthermore,  we  could  extend  the  decision  deferral 
by  "unzipping"  the  merge  —  that  is,  by  duplicating 
representational  structures  logically  later  than  the  merge. 

We might use the technique of duplication in another context
as well; if we could identify computational bottlenecks,
we might very profitably duplicate the structures
at  these  bottlenecks.  If  we  had  statistical  information 
about  the  relative  frequency  of  different  entry  paths  into 
a  given  merge,  we  might  also  implement  another  type  of 
decision  deferral:  we  could  "open"  the  most  probable  entry 
to  the  merge  on  a  provisional  basis,  even  though  the 
necessary  decisions  to  determine  priority  of  entry  had  not 
yet  been  made.  Again,  we  would  need  logical  machinery  for 
discarding unwanted values. Several of the above possibilities
involve duplication (i.e., part-part matching in reverse).
Because the data dependencies are exhibited explicitly we
can also move in the opposite direction. We have already
discussed one kind of part-part matching which is a standard
optimization technique: elimination of redundant
computations.  We  have  accessible  the  information 
necessary  for  a  global  attack  on  this  problem.  Where  two 
similar  operations  have  operands  generated  by  the  same 
transitions  (i.e.,  where  the  operands  are  members  of  the 
same  equivalence  classes),  we  can  combine  them.  That 
is,  we  can  replace  the  two  operations  with  one  operation 
which  generates  an  equivalence  class  representing  the 
union  of  the  two  equivalence  classes  generated  by  the 
replaced  operations. 
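
The combination step can be illustrated with a minimal sketch. It assumes a flat list of operations, each described by an operator, a tuple of operand equivalence-class names, and the equivalence class (a set of variable-uses) which its result generates. The function name `combine_redundant` and the class names in the example are hypothetical and do not come from the report.

```python
def combine_redundant(operations):
    """Merge operations that apply the same operator to the same operand classes.

    Each operation is (operator, operand_classes, result_class), where
    operand_classes is a tuple of equivalence-class names and result_class
    is a frozenset of the variable-uses fed by the operation's result.
    """
    merged = {}
    for operator, operand_classes, result_class in operations:
        key = (operator, operand_classes)
        # Operations with identical keys collapse into one operation whose
        # result class is the union of the classes they generated.
        merged[key] = merged.get(key, frozenset()) | result_class
    return [(op, operands, result) for (op, operands), result in merged.items()]


# Hypothetical example: two additions on the same operand classes, one product.
OPS = [
    ("+", ("a1", "b1"), frozenset({"x"})),
    ("+", ("a1", "b1"), frozenset({"y"})),
    ("*", ("a1", "b1"), frozenset({"z"})),
]
COMBINED = combine_redundant(OPS)
```

This is a single pass which treats operand order as significant; it does not attempt to recognize commutative operators.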

XII.  Implications for Hardware Design

Finally,  we  would  like  to  make  several  remarks  about 
machine  design.  As  the  theoretical  limits  on  the  speed 
of  computing  components  are  approached,  further  increases 
in  computing  rates  depend  increasingly  on  our  ability  to 
build  and  use  machines  with  highly  parallel  operating 
capabilities.  Leaving  aside  the  question  of  cost  (which 
in  any  case  can  only  be  evaluated  when  we  have  the  means 
to  determine  how  effectively  such  equipment  could  be 
exploited), the principal problem in designing such
computing equipment is not one of devising suitable
physical components. The principal problem is rather the
organization of physical components into a programmable
system. Even the most straightforward digital computer is
highly parallel in its operation in one sense: its
operation represents a very complex system of partially
ordered events. It is simply that this system has been
constructed in such a way that the subset of events
interesting to us as users of the machine will occur
sequentially (or very nearly so; even on the "programmable"
level of machine behavior we can cope with a limited
amount of concurrency). Digital computers are designed
in  this  way  so  that  sequentially  defined  algorithms  may 
be  mapped  onto  them.  It  is  because  of  this  that  they  are 
programmable.  Consequently  any  significant  reorganization 
of  hardware  to  exploit  more  fully  the  possibilities  of 
concurrent  operation  must  depend  upon  an  appropriate 
conceptual  reorganization  of  the  representations  of 
mathematical  processes  which  we  wish  to  perform. 


APPENDIX  I 


Petri  Nets  * 


Formally, a Petri net is a directed graph with two kinds
of  nodes:  places,  represented  as  circles;  and  transitions, 
represented  as  line  segments.  Each  directed  arc,  represented 
as  an  arrow,  connects  one  place  with  one  transition.  An 
arrow  from  a  place  to  a  transition  means  that  the  place  is 
an  input  to  the  transition;  an  arrow  from  a  transition  to 
a  place  means  that  the  place  is  an  output  of  the  transition. 
Every  place  in  a  net  is  an  output  of  at  least  one  transition 
and  an  input  to  at  least  one  transition.  No  place  may  be 
both  an  input  to  and  an  output  of  the  same  transition. 

A place is capable of two states: full or empty. The
state  of  a  net  is  given  by  a  list  of  all  its  full  places. 

A  transition  may  fire  if  and  only  if  all  of  its  inputs  are 
full.  When  a  transition  fires,  all  of  its  inputs  are 
emptied  and  all  of  its  outputs  are  filled.  If  some  place 
is  input  to  two  or  more  transitions,  all  of  whose  inputs 
are  full,  these  transitions  are  in  conflict.  Only  one  of 
the  transitions  —  any  one  —  may  fire  in  such  a  situation. 
(See Figures A, B, and C for examples of net diagrams.
Figure B shows a net with conflict.)
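
The firing rule and the notion of conflict can be sketched as follows. This is a minimal illustration, not a transcription of any figure: it assumes that a net is stored as a mapping from each transition to its input and output places, and the particular places and transitions (two transitions sharing input place B, with outputs D and E) are invented for the example. The fragment is not a complete net in the sense defined above, since its places are not all both inputs and outputs.

```python
# A net maps each transition to (input places, output places).
# Place and transition names here are illustrative only.
NET = {
    1: (frozenset("AB"), frozenset("D")),
    2: (frozenset("BC"), frozenset("E")),
}


def enabled(net, full):
    """Transitions all of whose input places are full."""
    return {t for t, (inputs, _) in net.items() if inputs <= full}


def fire(net, full, t):
    """Empty the inputs of t and fill its outputs; return the new state."""
    inputs, outputs = net[t]
    if not inputs <= full:
        raise ValueError(f"transition {t} is not enabled")
    return (full - inputs) | outputs


def conflicts(net, full):
    """Pairs of enabled transitions that share an input place."""
    ready = sorted(enabled(net, full))
    return [(s, t) for i, s in enumerate(ready) for t in ready[i + 1:]
            if net[s][0] & net[t][0]]


state = frozenset("ABC")            # the state is the set of full places
after_1 = fire(NET, state, 1)       # firing 1 empties B, so 2 can no longer fire
```

With A, B, and C full, both transitions are enabled and in conflict; once either one fires, the shared place B is empty and the other is disabled.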

*For a comprehensive account of Petri nets we
refer the reader to the "Final Report for the Information
System Theory Project", RADC Contract # AF 30(602)-4211,
by Dr. Anatol W. Holt et al.


Figure A


A net and an occurrence-graph representing its behavior.
The shaded places are full. The broken lines represent
time slices of the o-graph.


Figure  B 


A  net  with  conflict  and  the  o-cycles  which  constitute  its 
basis.  When  A,  B,  and  C  are  full,  either  transition  1  fires 
or  transition  2  fires,  but  not  both. 


Figure C


1L : Ball 1 is moving counter-clockwise.
1R : Ball 1 is moving clockwise.
2L : Ball 2 is moving counter-clockwise.
etc.


In using Petri nets to describe a system, each place is
associated with a proposition about the system. By
interpretation, when a place is full, the proposition
associated with it is true. In other words, the condition
described by a proposition holds in the system when the
associated place is full. The state of a system described
by a given state of its net is the conjunction of the
propositions associated with the full places.* Thus a
net diagram together with a suitable initial assignment
of place states (corresponding to the conditions which
hold in the system initially) makes possible a formal
simulation of the behavior of the corresponding system.

*It is perhaps misleading to speak of "system states"
here since a net does not necessarily define a totally
ordered sequence of states. (Formally, this is because
some transitions may fire concurrently; that is, their
firings are not temporally ordered.) In this respect,
nets differ fundamentally from state machines.

Note  that  it  is  the  occupancy  of  places  which  is  viewed 
as  having  duration.  Transitions  merely  bound  places; 
the  firing  of  a  transition  is  not  viewed  as  time-consuming 
—  rather,  it  is  a  separation  of  distinct  place  occupancies. 
Hence,  the  propositions  associated  with  places  describe 
conditions  involving  time-consuming  operations  or  states. 

Figure  C,  for  example,  is  a  net  representation  of  four 
balls  moving  and  colliding  on  a  single-lane  circular  track. 

The  propositions  describing  the  system  are  all  of  the 
form:  "ball  n  is  moving  clockwise  (or  counter-clockwise)". 

We may view an occurrence-graph, or o-graph, as a directed
graph which represents a simulation history of some net.
Formally, an o-graph consists of vertices, arcs, and
labels associated with the arcs. Each label corresponds
to some condition of the system being represented. (The
words label and condition are therefore used interchangeably
in this context.) Each arc represents an interval of
place occupancy (or condition holding); the place (and
hence  the  condition)  is  designated  by  the  label  associated 
with  the  arc.  An  inner  vertex  represents  a  transition 
firing  and  hence  an  occurrence  in  the  system  being  represented. 
(The  terms  inner  vertex  and  occurrence  are  accordingly 
used interchangeably.) Thus an occurrence may be described
as  follows:  the  conditions  of  the  input  arcs  cease  to 
hold  (the  input  places  become  empty);  the  conditions  of 
the  output  arcs  begin  to  hold  (the  output  places  become 
full).  (See  Figures  A,  B,  and  D  for  examples  of  o-graphs.) 

Two  occurrences  are  said  to  be  temporally  ordered  if  and 
only  if  there  is  a  path  from  one  to  the  other;  the  former 
precedes  the  latter.  Note  that  some  occurrence  pairs  in 
an  o-graph  are  temporally  ordered  while  others  are  not. 
Occurrences which are not ordered are said to be concurrent.
Similarly, two arcs are temporally ordered if
and  only  if  there  is  a  path  from  one  to  the  other;  arcs 
which  are  not  temporally  ordered  are  concurrent.  A 
time-slice  is  a  maximal  set  of  pairwise  concurrent  arcs. 

A  time-slice  represents  a  possible  state  of  the  net  (and 
hence  of  the  system)  during  the  history  which  the  o-graph 
describes.  (See  Figure  A.) 
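
The definitions of temporal order, concurrency, and time-slice can be made concrete with a small sketch. It assumes that an o-graph is given as a list of arcs, each a triple of label, tail vertex, and head vertex, and that every label occurs on only one arc. The example o-graph (one occurrence splitting condition A into B and C, followed by two independent occurrences) and all names are invented for illustration. Time-slices are found by brute-force enumeration, which is adequate only for very small graphs.

```python
from itertools import combinations

# Each arc is (label, tail vertex, head vertex); vertex names are illustrative.
ARCS = [
    ("A", "start", "e1"),
    ("B", "e1", "e2"),
    ("C", "e1", "e3"),
    ("D", "e2", "end1"),
    ("E", "e3", "end2"),
]


def reachable(arcs, start):
    """Vertices reachable from start along the arcs, including start itself."""
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for _, tail, head in arcs:
            if tail == v and head not in seen:
                seen.add(head)
                stack.append(head)
    return seen


def precedes(arcs, a, b):
    """Arc a precedes arc b if a path leads from the head of a to the tail of b."""
    return a != b and b[1] in reachable(arcs, a[2])


def concurrent(arcs, a, b):
    """Two distinct arcs are concurrent if neither precedes the other."""
    return a != b and not precedes(arcs, a, b) and not precedes(arcs, b, a)


def time_slices(arcs):
    """All maximal sets of pairwise concurrent arcs, as sets of labels."""
    slices = []
    for size in range(len(arcs), 0, -1):          # larger sets first
        for group in combinations(arcs, size):
            labels = frozenset(a[0] for a in group)
            pairwise = all(concurrent(arcs, a, b)
                           for a, b in combinations(group, 2))
            # A set is maximal if no larger slice found earlier contains it.
            if pairwise and not any(labels < s for s in slices):
                slices.append(labels)
    return slices


SLICES = time_slices(ARCS)
```

In this example the time-slices are {A}, {B, C}, {B, E}, {C, D}, and {D, E}: each is a state the system may pass through, and no single sequence of states is forced.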


Figure D

(two balls moving clockwise and two counter-clockwise)

(three balls moving counter-clockwise and one clockwise)

(three balls moving clockwise and one counter-clockwise)

An o-graph may be decomposed at a time-slice. Two o-graphs
may be composed if the terminal conditions of one are
identical to the initial conditions of the other. An
o-graph whose initial and terminal conditions are
identical is termed an o-cycle. An o-graph formed by
composing some number of copies of an o-cycle is termed
a repetition stretch of the o-cycle. An o-cycle which
cannot be decomposed into further o-cycles is termed an
irreducible o-cycle. (The o-graphs shown in Figures A,
B, and D are all irreducible o-cycles.) For every net
together with a suitable assignment of place states,
there  is  at  least  one  basis,  consisting  of  a  finite  set 
of  irreducible  o-cycles  from  which  every  possible 
simulation  history  may  be  generated  by  composition  and 
decomposition.  If  the  net  contains  no  conflict,  its  basis 
consists  of  one  irreducible  o-cycle.  Note  that  a  given 
net  diagram  may  be  capable  of  several  different  disjoint 
behaviors  given  different  initial  place  assignments. 

Figure  D,  for  example,  shows  the  bases  for  the  three 
different  behaviors  of  which  the  net  in  Figure  C  is 
capable.


APPENDIX II


Warshall's Algorithm

We start with a

definition: a p-graph is an ordered pair (G,N) where
G is a finite, directed, single-source, single-sink graph,
and N is any subset of the nodes of G which includes
the  source.  For  our  purpose  we  may  regard  G  as  the 
flow  graph  of  an  algorithm  where  the  unique  entry  and  exit 
are  the  source  and  sink  of  the  graph.  N  is  precisely  the 
set  of  initially  circled  nodes. 

definition: a p-graph is complete if, for any node n of
G, either:

(i) $n \in N$; or

(ii) there exists a node $n^* \in N$ such that any path from
any node in N to n includes $n^*$.

In terms of flow graphs, a graph is complete if every
node  has  a  unique  circled  ancestor,  i.e.,  every  use  of  a 
variable  belongs  to  a  unique  equivalence  class. 
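
The completeness condition can be checked directly on small graphs. The sketch below is one possible reading of the definition, not part of the original report: it treats a node as having a "unique circled ancestor" when exactly one circled node can reach it along a path whose intermediate nodes are all uncircled. The graph encoding (a dictionary of successor lists), the function names, and the diamond-shaped example are assumptions made for illustration.

```python
def last_circled_ancestors(succ, circled, n):
    """Circled nodes that reach n through uncircled intermediate nodes only."""
    pred = {v: set() for v in succ}
    for u, targets in succ.items():
        for v in targets:
            pred[v].add(u)
    found, seen, stack = set(), {n}, [n]
    while stack:
        v = stack.pop()
        for u in pred[v]:
            if u in circled:
                found.add(u)          # stop here: u is a last circled ancestor
            elif u not in seen:
                seen.add(u)           # keep walking backward through u
                stack.append(u)
    return found


def is_complete(succ, circled):
    """True if every uncircled node has exactly one last circled ancestor."""
    return all(n in circled
               or len(last_circled_ancestors(succ, circled, n)) == 1
               for n in succ)


# Hypothetical flow graph: node 1 branches to 2 and 3, which merge at 4.
DIAMOND = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}
```

With nodes 1 and 2 circled, node 4 is reached from both circled nodes and the p-graph is not complete; circling node 4 as well makes it complete.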

We  now  see  that  a  solution  to  the  naming  problem  is  included 
in  the  solution  to  the  problem  of  completing  a  p-graph.  To 
further  that  solution  we  prove  the  key 
theorem: 

If $(G,N_1)$ and $(G,N_2)$ are both complete p-graphs,
then $(G,N_1 \cap N_2)$ is a complete p-graph.
proof (Millstein):

$(G,N_1 \cap N_2)$ is trivially a p-graph. Suppose it is not
complete. Then there exists $n \in G$ such that

(i) $n \notin N_1 \cap N_2$; and

(ii) there exist $q_1, q_2 \in N_1 \cap N_2$, with paths $p_1$,
$p_2$ from $q_1$, $q_2$, respectively, to $n$ such
that $p_1$, $p_2$ do not have a common point in
$N_1 \cap N_2$.

case 1:

$n \in N_1 - (N_1 \cap N_2)$

Without loss of generality we choose

(a) $p_1$, $p_2$ to be cycleless paths; and

(b) $q_1$, $q_2$ to be the last points in $N_1 \cap N_2$ on paths
$p_1$, $p_2$ respectively; and

(c) $n$ to be the first point in $N_1 - (N_1 \cap N_2)$ common to
both paths. (Note: we use the finiteness in the
definition of p-graphs in making these choices.)

Now $(G,N_2)$ is complete. Also, $n \notin N_2$. Hence there
exists $n' \in N_2$ such that $p_1$, $p_2$ both pass
through $n'$. Let $p_1'$, $p_2'$ be the portions of $p_1$, $p_2$
between $q_1$, $q_2$ and $n'$.

Now $(G,N_1)$ is complete. Also, $q_1, q_2 \in N_1 \cap N_2 \subseteq N_1$.
$n' \in N_2$ and by assumption $n' \notin N_1 \cap N_2$ (or else $p_1$, $p_2$
have a common point in $N_1 \cap N_2$, contradicting (ii) above).
Therefore, $n' \notin N_1$ (and hence $n' \neq n$).

Therefore there exists $n'' \in N_1$ such that $p_1'$, $p_2'$ both
pass through $n''$.

Since $n' \neq n$ and $p_1$, $p_2$ are cycleless paths, $n'' \neq n$.
Therefore $n''$ is a point in $N_1 - (N_1 \cap N_2)$ common to both
$p_1$, $p_2$ and $n'' \neq n$. This contradicts (c) above.

case 2:

$n \in N_2 - (N_1 \cap N_2)$

By symmetry of argument this case leads to a contradiction.

case 3:

$n \in C(N_1 \cup N_2)$

By a construction similar to case 1 this case reduces to
case 1. Hence all three cases lead to a contradiction
so $(G,N_1 \cap N_2)$ is complete.

Our  main  result  is  contained  in  the 
corollary:

If M is an arbitrary subset of the nodes of G,
there exists a unique minimal set N of nodes of
G such that $M \subseteq N$ and (G,N) is a complete p-graph.

proof:

There  exists  at  least  one  set  with  the  required 
property;  take  all  nodes  of  G  .  Moreover,  since 
G  is  finite,  there  is  only  a  finite  number  of  sets 
with  the  required  property.  Therefore,  we  can  take 
the  intersection  of  all  such  sets  and  the  result  is 
the  required  minimal  N  . 


We  have  shown  that,  given  a  p-graph,  there  exists  a  unique 
minimal  completion  of  the  p-graph.  In  this  section,  we 
give an algorithm for computing this completion.


We  have  defined  the  algorithm  in  a  rather  peculiar  notation 
which requires some justification. The essential point
is  that  the  algorithm  depends  on  cycling  through  the 
elements  of  a  set,  where  the  effect  of  processing  an 
element  may  be  to  append  other  elements  to  the  set. 

If we attempt to express the algorithm in FORTRAN or
ALGOL,  we  are  forced  to  invent  a  data  structure  to  represent 
the  set;  perhaps  a  linked  list,  perhaps  a  bit  vector  to 
indicate  membership.  In  any  event,  we  find  ourselves 
making  a  decision  about  optimum  representation,  introducing 
new  symbolic  names  (for  the  list  head  and  pointers,  or 
for the bit vector), and inventing cyclic controls of the
loop-within-loop  type  which  are  more  complex  than  the 
simple  single  quantification  we  started  with. 

In  sum,  FORTRAN  or  ALGOL  representation  of  the  algorithm 
is  both  complex  enough  to  obscure  the  essential  logical 
structure  and  quite  arbitrary,  in  that  a  number  of  quite 
different-looking  algorithms  might  be  written  without 
logical  loss. 

We  have  elected  therefore  to  pay  the  price  of  an  unfamiliar 
notation,  in  the  hope  that  the  very  simple  expression 
which  results  will  disarm  the  reader's  discomfort  with  a 
novel  and  not  very  well-defined  language. 


INITIAL CONDITIONS

We imagine as given:

1. D, a constant equal to the number of nodes.

2. VAL(I), a vector where 1 ≤ I ≤ D. VAL(I) = I
if the I-th node of the given p-graph is circled;
VAL(I) = 0 otherwise.

3. S(I), a family of sets, where 1 ≤ I ≤ D. For
any I, S(I) is the set of nodes which are
immediate successors of the I-th node.


TERMINAL  CONDITIONS 

1. VAL(I) = I, if the I-th node of the completed
p-graph is circled; otherwise VAL(I) = J,
where the J-th node is the last circled ancestor
on all paths to the I-th.

2.  D  and  S(I)  are  unchanged. 

VARIABLES 

1.  I  ,  J  ,  and  Q  are  variables  which  assume 
integer  values. 

2.  NOTYET  is  a  variable  whose  value  is  a  set  of 
integers. 


Algorithm:

COMPLETE
ALPHA $ Q ← 0.

NOTYET ← {I | 1 ≤ I ≤ D}.

(∀I | I ∈ NOTYET ∧ VAL(I) ≠ 0)(BLEED(I). NOTYET ← NOTYET − {I}.)
If Q ≠ 0, (∀I | VAL(I) ≠ I)(VAL(I) ← 0). GO TO ALPHA.
EXIT.

BLEED(I)

(∀J | J ∈ S(I))(FLOW(I,J).).

EXIT.

FLOW(I,J)

If VAL(J) = 0, VAL(J) ← VAL(I). EXIT.

If VAL(J) = VAL(I), EXIT.

If VAL(J) = J, EXIT.

VAL(J) ← J. Q ← 1. EXIT.
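
For readers who prefer a conventional language, the following is a hedged transcription of COMPLETE, BLEED, and FLOW into Python, based on one reading of the notation above. It makes exactly the representational choices which the notation avoids: the set NOTYET becomes a Python set, the quantification over a growing set becomes an explicit worklist, and `None` stands in for the value 0. The function name `complete`, the dictionary encoding of S(I), and the example graphs are assumptions of this sketch.

```python
def complete(succ, circled):
    """Return VAL for the completed p-graph.

    succ maps every node to its immediate successors (the sets S(I));
    circled is the initial set of circled nodes and must contain the source.
    None plays the role of the value 0 in the original notation.
    """
    circled = set(circled)
    while True:                                   # label ALPHA
        # Resetting every uncircled node corresponds to VAL(I) <- 0.
        val = {n: (n if n in circled else None) for n in succ}
        not_yet = set(succ)                       # NOTYET
        work = [n for n in succ if val[n] is not None]
        conflict = False                          # Q
        while work:
            i = work.pop()
            if i not in not_yet:
                continue
            not_yet.discard(i)
            for j in succ[i]:                     # BLEED(I)
                if val[j] is None:                # FLOW(I, J)
                    val[j] = val[i]
                    work.append(j)                # J now qualifies for BLEED
                elif val[j] == val[i] or val[j] == j:
                    continue
                else:
                    val[j] = j                    # two values meet: circle J
                    circled.add(j)
                    conflict = True
        if not conflict:
            return val


# Hypothetical examples: a diamond with a second circled node, and a loop.
DIAMOND = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}
LOOP = {1: [2], 2: [3], 3: [2, 4], 4: []}
DIAMOND_VAL = complete(DIAMOND, {1, 2})
LOOP_VAL = complete(LOOP, {1, 3})
```

In the diamond, node 4 receives different values along its two entry paths and is therefore circled; in the loop, the same happens to node 2. Each repetition from ALPHA circles at least one new node, so the sketch terminates on any finite graph.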